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NONPARAMETRIC ESTIMATION OF A POINT-SPREAD 
FUNCTION IN MULTIVARIATE PROBLEMS 

By Peter Hall^ and Peihua Qiu^ 

Australian National University and University of Minnesota 

The removal of blur from a signal, in the presence of noise, is 
readily accomplished if the blur can be described in precise mathe- 
matical terms. However, there is growing interest in problems where 
the extent of blur is known only approximately, for example in terms 
of a blur function which depends on unknown parameters that must 
be computed from data. More challenging still is the case where no 
parametric assumptions are made about the blur function. There has 
been a limited amount of work in this setting, but it invariably relies 
on iterative methods, sometimes under assumptions that are mathe- 
matically convenient but physically unrealistic (e.g., that the operator 
defined by the blur function has an integrable inverse). In this paper 
we suggest a direct, noniterative approach to nonparametric, blind 
restoration of a signal. Our method is based on a new, ridge-based 
method for deconvolution, and requires only mild restrictions on the 
blur function. We show that the convergence rate of the method is 
close to optimal, from some viewpoints, and demonstrate its practical 
performance by applying it to real images. 

1. Introduction. Observed signals are usually not exactly the same as 
true signals, but are instead degraded. This can occur through the entire 
process of signal acquisition, for a variety of reasons. For example, in aerial 
reconnaissance, astronomy and remote sensing, signals are often adversely 
affected by atmospheric turbulence or aberrations of the optical system. 
Signal degradations can be classified into several categories, among which 
point degradation (or noise) and spatial degradation (or blur) are the most 
common. Other types of degradation involve chromatic or temporal effects. 
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For a detailed account of the formation and nature of degradations the reader 
is referred to books such as those by Andrews and Hunt [1] and Bates and 
McDonnell [2]. Related discussion is also given by Qiu [18]. 

In image analysis the true signal is often observed, or scanned, on a two- 
dimensional pixel grid, subject to both noise and blur. More generally, a 
signal may be recorded in any number of dimensions. For example, Lidar 
imaging devices record in d = 3 dimensions, and a great deal of signal anal- 
ysis is conducted in the case d=l. 

In these settings it can be considered that we observe 

(1.1) Y{j) = {4>ij){j)+N{j), 

where tp denotes the true signal, represents noise, is a linear operator 
applied to ip and Y is the noisy signal. The latter is acquired on a d-variate 
square lattice, and therefore j, in (1.1), is a member of the set Z'^ of all 
d- vectors of integers. We shall use the symbol cf) to denote also the kernel of 
the operator (p; this function is sometimes referred to as the blur function. 
Thus, = Y.k (PU - k) tp{k), for each j G Z'^. 

In an image-analysis interpretation of (1.1), ip denotes the true scene, Y 
is the observed image, the function cp is called a point-spread function, Z'^ 
is a mathematical representation of the pixel grid on the Charge Coupled 
Device (CCD) and of course, d = 2. In this setting, and also more generally, 
we expect (p to preserve signal intensity, that is, ^j(p{j) = 1- In particular, 
this implies that ii ip = b for a constant 6, meaning that the true signal is of 
constant "brightness," then (p^p = ip . 

Image restoration (when d = 2), or, more generally, signal restoration, is 
a process for reconstructing a close approximation to the unobserved sig- 
nal ^p from its observed but degraded form, Y. Many procedures for image 
restoration assume that (p is known. This is the case with, for example, the 
inverse filter, Wiener filter, constrained least-squares filter, Lucy-Richardson 
procedure, Landweber procedure, Tikhonov-Miller procedure, maximum a 
posteriori (MAP) procedure, maximum entropy procedure and techniques 
based on the EM algorithm. See, for instance, [20], [11], Chapter 5, [4] and 
[10]. In some settings this is reasonable, since (p can be specified, at least ap- 
proximately, using our knowledge of the signal acquisition device. However, 
in other applications this information is not available, and so approxima- 
tion (or estimation) of is a prerequisite for image restoration. This is the 
context of the present paper. 

Signal restoration when <p is unknown is referred to as blind signal restora- 
tion. A number of procedures have been proposed for solving this problem. 
They can be grouped into two categories. In the first, cp is described by a 
parametric model, usually with just one, but occasionally two, parameters. 
See, for example, the work of Cannon [3], Katsaggelos and Lay [15], Ra- 
jagopalan and Chaudhuri [19], Carasso [5] and Joshi and Chaudhuri [14]. 
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The other class of procedures assumes that the true signal consists of an 
object with specific, known features — for example, a shape with known sup- 
port, against a uniform background — but involves only weak, nonparametric 
assumptions about (j). See, for example, the contributions of Yang, Galat- 
sanos and Stark [22] and Kundur and Hatzinakos [16]. 

It is to the latter category that we contribute in this paper. We intro- 
duce a method which, working from a known test signal and making only 
mild, nonparametric assumptions about the blur function, recovers the lat- 
ter without suffering the drawbacks of earlier nonparametric techniques. In 
particular, the mechanism leading to the observed signal is not precisely 
known because we lack information about the blur function, rather than 
about the true signal. 

Using our technique, the blur function does not need to have an integrable 
inverse, or reciprocal. The latter assumption will very seldom be satisfied 
in practice, although it is made in recent, related literature. Moreover, our 
technique is substantially less complex than the iterative approaches which 
are invariably used in nonparametric settings. 

We introduce a new, ridge-based deconvolution algorithm. Unlike conven- 
tional methods, this technique is well suited to inversion when the Fourier 
transformation of the point-spread function vanishes at infinitely many points 
Standard approaches to dealing with this problem sometimes resort to "fenc- 
ing off" those zeros, and then dealing separately with each one. That can 
be particularly awkward, and is avoided by our ridge-based method. In ad- 
dition to having good numerical performance, the ridge technique achieves, 
in some settings, theoretically optimal convergence rates, and so is no less 
"sharp" than its more conventional competitors. 

Our theoretical work is related to earlier contributions of Hall [12], John- 
stone and Silverman [13], Donoho and Low [7], Donoho [6], Van Rooij, 
Ruymgaart and van Zwet [21] and Ermakov [8]. These authors, in a variety 
of settings, discuss consistency and convergence rates for inverse estimators 
computed from unknown signals and from known blur functions. 

The method is introduced in Section 2. Some of its theoretical properties 
are discussed in Section 3. A numerical study in Section 4 describes our 
method's statistical features and its application to real images. Technical 
details are deferred to Section 5. 

2. Models and estimators. 

2.1. Model for degraded, noisy signal. We assume model (1.1) through- 
out. The noise, N , is taken to be independent and identically distributed 
at each lattice point j, with variance > 0. We suppose that (f) preserves 
signal intensity. 
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We might think of the pixel-based blur function cp as representing a dis- 
crete approximation to an idealised, smooth blur function, g say, which 
operates in the continuum. If the pixel width is considered to be n~^, where 
n > 1 is an integer which we shall permit to become arbitrarily large, then 
the relationship between <j) and g might be taken to be 

0(j) = n"'' f{j/n) for all j € Z"^ and 

(2.1) 

f{x) = Sdg{x) for all x G M*^, 

where we might take the function g to be fixed (i.e. not depending on n) and, 
if (f) preserves intensity (e.g., preserves the light energy striking the CCD in 
a typical imaging device), the scaling factor Sd satisfies 

(2.2) Sd=\n-''Y.9{j/n)\ -1 

as n — > oo. The limiting relation in (2.2) holds because J g = 1, this being 
the continuum version of (p{j) = 1. Thus, / is a normalized version of g, 
on the pixel grid, and (/> is a discretized version of /. 

The suggestion that g he a fixed function is made here only to simplify 
our ideas. In our subsequent theoretical work we shall, through analogous 
changes to (p, permit the spread of g to alter with n, so that the difficulty 
of the imaging problem can evolve as the amount of information changes. 

We shall take / to be supported on the sphere of radius A„/n. It follows 
that g is supported on the same set. 

2.2. Model for test signal. In the case d = 2, test signals, or test patterns, 
are frequently used to determine a point-spread function from data. Test 
patterns are images that are known to significantly greater accuracy than 
that provided by the image recording device under test. In fact, test patterns 
are generally known completely; there is no need to estimate parameters, 
and in this sense the term "parametric image model" would be misleading 
if it were applied to a test pattern in a narrow statistical sense. In practice, 
performance is often assessed visually; in this paper we use mathematical 
closeness in the L2 metric in lieu of subjective assessment. 

Real test patterns are typically comprised of regular geometric shapes, 
such as rectangles. We shall treat such a signal here, in the d-variate case, 
although to simplify notation and discussion we shall assume that there is a 
single rectangular prism, mj pixels wide along the jth axis for j = 1, . . . ,d. If 
the sides of the prism are parallel to the pixel axes, if the lower left- and upper 
right-hand corners of the rectangle are at (oi, . . . , a^) and (61, . . . , 6^), respec- 
tively, and if the value of the signal is 1 within the rectangular prism and 
outside, then il){ki,. . . , kd) equals 1 if aj < kj < bj for 1 < j < d, and equals 
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zero otherwise. It follows that, with t = {ti,. . . ,td)^ and j = (ji, . . . 
we have 



(2.3) 




1=1 



sin(i£/2) 



where mi = bi — ae + 1, and the superscript Ft denotes the discrete Fourier 
transform. 

If, as in the discussion of (2.1) and (2.2) in Section 2.1, we consider the 
lattice Z*^ to represent a rescaled pixel grid where neighbors are, in reality, 
distant rather than 1 apart along each axis, then it is reasonable to con- 
sider m£ to be asymptotic to cgn, where in this instance we take q > to be 
fixed as n diverges. In this way the rectangular mi x • • • x prism repre- 
sents, as n diverges and scale is suitably adjusted, an increasingly accurate 
approximation to a prism with edge lengths ci, . . . , 

2.3. Discrete Fourier transforms. Assume that (j) vanishes outside a known 
sphere TZ = TZ{n) in Z*^, centred at the origin, O, and of radius A„, where 
n/\n is bounded; and that ip likewise is zero outside a known set S, which 
extends no further than radius 0{n) from O. Put T = TZ®S = {j + k:^ 
TZ,k € S}. Then (j)^p vanishes outside T, and 

jezd jell jes 

and = ^^^i>^\ 

In a slight abuse of notation we denote by Y^^ and A^^'' the Fourier 
transforms of Y and N restricted to T, 

ier jer 

Therefore, a Fourier-transform version of (1.1) has the form 

(2.4) Y^\t) = (l/\t)^^\t) + N^\t), tGM'^. 

Result (2.4) highlights the symmetry of the problem: In principle, identical 
methods can be used to recover (p from Y knowing and to recover ip 
knowing (j). However, a marked degree of asymmetry is often introduced 
through the typical forms of (p and ip. Again the problem of image analysis 
provides a convenient example. There, when the point-spread function (p 
is known, and the problem is that of estimating the true scene, then (p 
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is generally smooth, and in particular cf) (t) generally converges relatively 
quickly to zero as \\t\\ increases. (Here, || • || denotes the Euclidean metric 
on M*^.) On the other hand, when the true scene is known, for example a 
test pattern, and the problem is one of estimating the point-spread function, 
■0 is often unsmooth. In particular, as indicated in Section 2.2, 'ip contains 
jump discontinuities, representing the sharp boundaries in a test pattern. In 
such cases, ilj^^{t) generally converges to zero relatively slowly. Of course, 
there are exceptions to these generalities; for example, if (p denotes the point- 
spread function corresponding to motion blur then it is unsmooth. 
We shall concentrate on the problem of estimating (p from known ^p. 

2.4. Estimation of 4> from known ip. Let p{t) denote a positive constant 
multiple of a known, positive function of the real variable, t. We use p{t) 
as, in effect, a ridge when regularizing a Fourier transform. In particular, 
recognizing that (fF^ = {(pip)^'^ / ip^^ and therefore 



Here, r > 0; choosing r > removes potential numerical problems associated 
with computing the integral in (2.6). 

We may think of (2.6) as having been obtained from (2.5) by (a) mul- 
tiplying the numerator and the denominator in the integrand of (2.5) by 
ip^^ {—t)\ip^^ {t)Y ; (b) replacing |'i/;^''(t)| by the maximum of that quantity 
and the ridge, in the quantity |^^*(t)|''"'"^ which step (a) produces in the 
denominator; and (c) replacing {<pip)^'^{t) in the numerator by its unbiased 
approximation, Y^^{t). 

In some cases, considerations of symmetry in the process for manufac- 
turing the signal recording device imply that, to a first approximation, cp is 
radially symmetric. For example, glass (as distinct from resin) lens elements 
are typically manufactured using a polishing process which involves rolling 
a large sphere, with cylindrical glass blanks attached, inside another sphere. 
However, errors in this process can introduce asymmetric aberrations to 
such elements, in particular because the outer, grinding sphere is worn, or 
the glass blanks are not correctly secured. Other causes of asymmetry result 
from inaccuracies in the alignment of elements within the lens, or in the ce- 
menting of lens elements together. Since the design of a lens is often highly 
complex, there are many different ways in which asymmetric aberrations can 
arise, and no standard parametric models for the blur functions that they 
might produce. 



(2.5) 




where A 



7r,7r]^, we define an estimator (p oi cp hy 



(2.6) 
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3. Theoretical properties. 

3.1. Mean-squared error criteria for choosing the ridge. We define the 
sum of squared errors of (^(j) to be 

ssE= y: m-m\'- 

From this formula and the definition of at (2.6), it follows that the 

mean summed squared error (MSSE) admits the formula 

MSSE = £(SSE) = <rH*T)j^ fAwnhv^} 

(3.1) 



(27r)rf7^'^ ' I (|V.'Ft| Vp)'''+2, 

The first and second terms on the right-hand side represent the total con- 
tributions to mean summed squared error from variance and squared bias, 
respectively. For example, the first term on the far right-hand side equals 

y E\4>{j) - E^{j)\^ = I [-^ff^I^^XE\N'''\\ 

'^^-^^ (27r)^;^l(|V'Ft| Vp)-+2J I I' 

and the claimed relationship follows from the fact that E\N^^\'^ =cj2(#T). 

We suggested in Section 2.1 that the lattice be interpreted as a rescaled 
version of a pixel grid with edge length n~^. We claim that in this setting it 
is appropriate to work with n'^MSSE, rather than directly with MSSE. To 
appreciate why, recall from (2.1) and (2.2) that n'^cl){j) can be interpreted 
as a discrete approximation, on a pixel grid, to a continuous blur function 
/ evaluated at j/n. In this context, f{j/n) = n'^cpij) can be viewed as an 
estimator of f{j/n) and extended to M.'^; and ji'^MSSE can be interpreted 
as a discrete approximation to the mean integrated squared error of / as an 
approximation to /. 

3.2. Asymptotic properties of cf^^ and Reflecting the rescaling dis- 

cussed above, define (l)^{t) = (j/^{t/n), ipn^it) = n~'^ip^^{t/n), p„ = n~'^p, 
An = [—mr,mr]'^ and r = n~'^(^T). In this notation, 



n'^MSSE : 



1 M W 



/.Ftlr+l 



(27r)'^7^a(|V^Ft|vp„)r+2 

(3.2) +J-.f \£'\^U-^M''^^' 



Since n'^MSSE can be represented so simply in terms of (p^^ and ip^^, then 
it is of interest to know properties of those functions. 



8 



P. HALL AND P. QIU 



We shall work with classes of compactly-supported blur functions (j) for 
which the associated, rescaled Fourier transform, c/)^', decreases at least 
polynomially fast in the tails. In general it is awkward to prove that such 
a rate of decrease occurs arbitrarily far out in the tails, but fortunately we 
need it only a distance o{n) from the origin. 

Performance is determined by three main parameters: n, the number of 
observations per linear unit of space; cr^, noise variance; and A„, blur radius. 
Arguably the first two of these are the most intrinsic, although also plays 
a major role. 

These considerations lead us to define the following class of sequences of 
blur functions. Given a sequence A = {An} of positive numbers, and p > 0: 

Let J^{A,p) denote a class of sequences of functions <j) depending on n, 

with the following properties for each given n: (a) (j) vanishes outside a 

d-variate sphere, centered at the origin, of radius A„; (b) J2j4'U) — 1; 

(3.3) and (c) for each positive sequence £n decreasing to zero as n — > oo, 
(1 + ||i||)*'|'^n*(i)| is bounded uniformly in t € An with ||t|| < ne„ 
and in (/) G T{A,p), and |</>.}i^*(t)| = 0{(ne„)~'^} uniformly in t E An 

with ||t|| > ne„ and in e T{A,p). 

Here and below, "(/> E J^{A,p)" means that the sequence of blur functions 
for which the function, at "time" n, is (j), is in T{A,p). Thus, (p depends on 
the pixel scale-factor n, although to prevent ambiguity we indicate this in 
notation only for the Fourier transform (p^^ of the rescaled version of (p, not 
for (f) itself. 

To illustrate, we introduce a function which satisfies the conditions 
in (3.3). The function class J^{A,p) could be taken to be a set of rescaled 
versions of this (p, but of course it can be much larger. 

Consider a compactly supported, continuum blur function, g{x) = Ai 11^(1 ~ 
xf)^ for sup^ \xi\ < 1. The constant Ai > is chosen so that the function in- 
tegrates to 1 on [—1, 1]"', or equivalently, so that signal intensity is preserved. 
The associated characteristic function, 

g^\t) = AJ e^'^4{{{l-x}f\dx, 

J x:svix>i\xi\<l 1^=1 J 

satisfies \g^^{t)\ < ^2(1 + 11*11)"^, for all t G R'^, where > is a constant. 
The discrete blur function (p analogous to g is 

(3.4) 4>{j)=A2{n)n-''Y[il-n-^j!r 
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for sup^lj^l < n, where the bounded sequence A2{n) is chosen to preserve 
signal intensity. In this case, A„ = 0{n); that is, <j) is supported within a 
sphere of radius n of the origin. For each sequence i there exists a 
constant ^3 > such that the corresponding 0^* satisfies |</>J^'^(t)| < ^3(1 + 
\\t\\)~P for aU ||t|| <nen, and also \(f>l^{t)\ <^3(ne„)-'^. 

The signal model introduced at (2.3) admits a concise asymptotic descrip- 
tion. Let us, in (2.3), take C£ = mi/n; then 

(3.5) ^«(<)=,«<n^!HM^, 

where On € M'^. If q is either fixed or converges to a finite, nonzero number 
as n— >oo, then V'n*(^) asymptotic to V'u^(i) = n^{2t^^ sin(Qt£/2)}. It 
follows that, for each sequence e„ | 0, |^^*(t) — V'um(*)l/l^um(OI ~^ uni- 
formly in ||t|| < ne„, and |'0n*(OI — 0{(ne„)~'^} uniformly in t S An for 
which ||t|| > nSn- 

3.3. Upper hound to rate of convergence of MSSE. Our main result is the 
following. Define J-{A,p) as at (3.3). The formula for the threshold, Pnit), 
given there can be changed without appreciably altering the results. For 
example, the theorem continues to hold if, in the expression for pn{t), we 
replace O^d^^l V 1) by simply 1, strengthen the condition p > ^{d + q) to 
p> d + and weaken the assumption q> 3d to q> 2d. 

Theorem 3.1. Assume that n/Xn is bounded as n^oo, and that the 
noise variance, a"^ = a"^, satisfies n~'^\'^a'^ — > as oo. Let hn denote a 
positive sequence decreasing to zero, put Pn{t) = hniJJiHtel V 1)}~^ take 
r >0 in the definition of cj){j) at (2.6) and assume that p > ^{d + q) and 
q > 3d. Then, as oo, 

(3.6) sup n'^MSSE = 0{in~'^X'^alh;;^ + hn)ilogn)'^^^}. 

Remark 3.1 [Optimizing choice of hn)- The theorem implies that a 
mean-square convergence rate of essentially (A^c^/n'^)^/^, uniformly over 
4> € J^{A,p), can be achieved by taking /i„ x (A^cr^/n'^)"^/^: 

(3.7) sup n'^MSSE = 0{{X'},a^Jn^y/\lognf-^}. 
</.e.F(A,p) 

The notation On^hn, for positive a„ and hn-, means that On/bn is bounded 
away from zero and infinity as n — > oo. 

Remark 3.2 (Smoothness of <j))- The convergence rate in (3.7) does 
not depend on the smoothness of (f), represented by p in the function class 



10 



p. HALL AND P. QIU 



J^{A,p), provided p exceeds ^{d + q). It is of interest to consider what this 
means in terms of the number of derivatives enjoyed by the blur functions. 
Let us take q = 3d+, that is, just a httle larger than 3d. Then the condition 
p> ^(d + q) reduces to p> 2d. If T{h.,p) is sufficiently large, for example 
if it contains a scale-changed version of the cj) defined at (3.4), then the 
assumption that all the blur functions in J^{A,p) have s square-integrable 
derivatives is equivalent to asking that p> s + ^d. In this setting the smooth- 
ness condition imposed in the theorem reduces to the restriction that all the 
functions in the class have d bounded derivatives. In the important special 
case of image analysis, d = 2 and just two derivatives are required. 

Remark 3.3 (Smoothness of If the test signal, V) is & relatively 
smooth function, then the mean-square accuracy of even an optimal ap- 
proximation to (j) can be inferior to the rate in (3.7). For example, taking 
A„ = n, (T^ = and d = 1 for simplicity, the rate in (3.7) is n~^^'^. However, 
assuming that iV'n*! decreases like (1 + ||t||)~'^ as ||t|| diverges, the minimax- 
optimal rate of convergence of mean-squared error in estimation of (j) can 
be shown to equal n~'^^^^'^P~^'^^~^^\ (See [9] for related results in density de- 
convolution problems.) This is inferior to the rate n~^/^ unless p > s + ^. 
Therefore, if the test signal is very smooth, the blur function must be even 
smoother if the accuracy of the estimator of the blur function is not to be 
degraded relative to that for a rough test signal. 

3.4. Lower bound to rate of convergence of MSSE. Let / denote a fixed, 
spherically symmetric, compactly supported probability density on M"^, for 
which 

(3.8) sup(l + ||t|I)^'|/^*(t)|<cx), 

where p > 0. Let ^ be a d- vector, and put 5n = A~^, 

(3.9) xe{x) = ci^e5if{5nx){l + 9cos{i^x)}, 

where ^ = or 1 and ci^ denotes a constant. (Here, and below, we suppress 
the dependence of quantities such as xe and ci^ on n.) Note that 

5i j f{5nx)cos{(^xy^'^''dx 




Therefore, if we define ^ = l + 6'/^'(^/(^n)) then xe is a probability density. 

Let the blur function (pe denote the conventional discrete approximation 
to Xe, 

(3.10) Mj) = C2,en-\e{i/n), j G Z, 
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where the constant C2,6» is chosen so that intensity is preserved, that is, 
J2j4'e{j) = 1- Under the conditions given in the theorem below, this stan- 
dardization entails C2^e ~^ 1 as n — > oo. Let MSSE^g denote the version of 
MSSE for a general estimator of 4> [not necessarily the estimator at (2.6)], 
when the true (p is (pe find the scale parameter equals n. 

Theorem 3.2. Assume Sn x (o-^/n'^)i/(3d)^ ff^at Cin''^^ < 61"^ < C-in''^^ , 
where Ci, . . . ,C\ > and d{l — p~^) < C4 < C2, and that p > max(|d, 
|C2 + 1). Suppose, too, that the noise variables N{j) are independent and 
identically distributed as Normal N(0,(T^). Then, for a choice of ^ in (3.9) 
that depends only on ci,...,Cd in the definition of the test signal i/j [see 
(3.5)], 

(3.11) liminf(n'^/A^a2)V2 j^^MSSE^e > 0. 

In view of (3.8), the sequence of functions (pe, indexed by n, is J^{A,p) for 
d = 0,1, provided the constant in the uniform bound on (1 + ||t||^) (i)| in 
(3.3) is chosen sufficiently large. In this case, (3.11) implies that 

(3.12) liminf(n'^/(j2)V2 gup n'^MSSE^g > 0. 

Assuming the relation A„ ^ {n'^/a'^)^^^^'^^ between noise variance and sup- 
port of the blur function, and with the exception of the logarithmic factor 
in (3.7), (3.12) is a converse of (3.7). Within these constraints, the estimator 
(j) at (2.6) recovers (j) from the test-pattern data at the optimal rate. 

In the case cr^ = n~^, treated in Remark 3.3, Theorem 3.2 shows that 
the convergence rate achieved is optimal if Xn x ri^'^~^^^^^^^\ This is a more 
realistic assumption than >i n, imposed in Remark 3.3, since it permits 
the number of pixels that represent the width of the blur function to be an 
order of magnitude less than the number per linear unit of space. 

4. Numerical results. 

4.1. Square-block test pattern. Here we summarize the results of a sim- 
ulation study when d = 2, in cases where the true image, represented by 
the function -0, is a simple square block with intensity 1, against a white 
background with intensity 0. See panel (a) of Figure 1. We take the true 
continuum blurring function to be 

(4.1) g{x,,X2) = - (xi/A)2}{l - {X2/Xn 

for sup(|a;i|, 1x2!) < A, and g{xi,X2) =0 otherwise, as suggested in Sec- 
tion 3.2 with p= 5. Note that g is not circularly symmetric; that is, g{xi,X2) 
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Fig. 1. Graphs oftp, 0, (j)^ and Y. Panel (a) shows the true image, ^. Panel (b) shows 
its blurred version, (pip. The function (j> itself is depicted in panel (d). Panel (c) shows 
the blurred image plus noise; the latter was N (0,0.1^) on each pixel. Digitization was on 
a 128 X 128 



is not a function of x1 + X2 alone. (The great majority of parametric models 
for point-spread functions are circularly symmetric.) We denote the dis- 
cretized version of 5 by (p. 
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Table 1 

Mean summed squared error. Tabulated are values of MSSE (the first number in each 
entry) and se of SSE (the second number) of the estimator 4> defined at (2.6), based on 
101 simulations. The optimal value of hn is presented in the second line in each entry. 
The noise distribution is N(0,(j^) 



r 


o- = 0.05 


a = 0.1 


a = 0.2 


1 


1.9744, 0.0607 


2.0987, 0.0750 


2.3457, 0.0887 




1.0 X 10"" 


1.1 X 10"" 


1.2 X 10"" 


10 


1.2046, 0.0779 


1.3708, 0.0474 


1.4961, 0.0954 




2.0 X 10"^ 


3.1 X 10"^ 


3.2 X 10"^ 


50 


0.6397, 0.0324 


0.6644, 0.0531 


0.7533, 0.0874 




1.7 X 10"^ 


1.7 X lO"'^ 


1.7 X 10"^ 


55 


0.6397, 0.0324 


0.6643, 0.0531 


0.7532, 0.0873 




1.7 X 10"^ 


1.7 X 10"^ 


1.7 X 10"^ 



See Figure 1(d) for a perspective plot of (p when n = 128 and A = 0.2. 
Figure 1(b) shows the result of blurring •(/; using (f>. If we add independent 
and identically distributed N(0, cr^) noise to the blurred image at each pixel, 
then we obtain, when o" = 0.1, the result shown in Figure 1(c). 

We evaluated the performance of the estimator cf), defined at (2.6), when 
n = 128, A = 0.2 and a = 0.05, 0.1 and 0.2. For the estimator (p we chose 
Pnit) = hn\\t\\^, which, along with g in (4.1), satisfies the conditions given in 
Section 3.2. There are two parameters, r and hn, involved in the estimator (p. 
We found that, in most cases (e.g., n = 128 or 256, a S [0,1]), results were 
improved when r increased in the range [0, 50] , and they did not change 
much when r was chosen larger than 50. However, when r was chosen too 
large (e.g., larger than 60), numerical underflow sometimes occurred in the 
computations, since the denominator in (2.6) was very small in such cases. 
To demonstrate this, we consider four r values: 1, 10, 50 and 55. For each 
combination of a and r, we searched for the optimal value of /i„ in the range 
[0, 10~^], with step-length 10~^. In this analysis we employed MSSE (mean 
summed squared error) to define optimality, as in Section 3, and used as our 
data the results of 101 simulations. Values of MSSE, the standard error of 
SSE and the optimal value of hn are presented in Table 1. 

From Table 1 it can be seen that: (a) In all cases considered, MSSE values 
are stable when r is chosen larger than 50; (b) MSSE increases with a, but 
the effect of a is quite small; (c) hn should be chosen smaller when r is larger 
or a is smaller. 

We found that, for the smaller sample sizes treated in our numerical work, 
the estimator performed well except that it under-estimated the peak of (p 
a little. This is a common aberration of nonparametric curve and surface 
estimators, which tend to be biased down in peaks and up on troughs. The 
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tendency can be largely removed by making a simple change of scale, 
(4.2) 0(3:1, X2) = 0(xi/s,X2/s), 

where s > is a tuning parameter. 

In practice, all tuning parameters, including s, would be chosen to give 
the best visual impression. This approach is common in image analysis, and 
avoids difficulties that arise when using mathematical criteria that are based 
on L2 performance but do not approximate visual perception particularly 
well. See [17] for discussion. 

Figure 2(a) shows the estimator (j) that has median value of MSSE, out 
of 101 simulations, when a = 0.1, r = 50, /i„ = 1.7 x 10~^ and s = 0.92. Its 
profiles in the cross-sections of X2 = and xi = are shown in Figures 2(b) 
and 2(c), respectively, by the dotted curves. In these two plots, the solid 
curves denote the profiles of the true point-spread function (p, and the short 
and long-dashed curves denote the profiles of the estimator (p having median 
value of MSSE, out of 101 simulations, when r = 50 and a = 0.05, 0.1 and 
0.2, respectively. 

4.2. Application to cameraman image. To illustrate how our methodol- 
ogy affects the restored image in the entire image restoration process, we 
used the popular cameraman image as an example. The original image is 
shown in Figure 3(a); it is of size 256 x 256 pixels, with gray levels in the 
range [0, 255] . A blurred version of this image, using the point-spread func- 
tion g at (4.1) with A = 0.05 (i.e. with a 25 x 25 pixel blurring window) is 
shown in Figure 3(b). Figure 3(c) depicts the image that is obtained after 
adding independent and identically distributed N(0, 5^) noise to the image 
in Figure 3(b). 

We pretended that these images were made by the same image acquisition 
device as that for the test image shown in Figure 1. Then the point-spread 
function, (p, was estimated by (2.6) and (4.2) from the degraded test image, 
using the same level of blurring and noise as for the degraded cameraman 
images. We fixed r at 50, as before. 

There are several existing procedures for restoring ■0 from Y, if is known 
or estimated. We chose two noniterative procedures: the inverse filter with 
a hard threshold, and the Wiener filter. The restored image computed by 
the first approach is given by 

(2^^/ 1 ^^^(l<^^*WI>7)exp(itTx)di|, 

where 4> denotes the estimated point-spread function, and 7 > is the thresh- 
old. The restored image obtained by the second approach is defined by 
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(0 

Fig. 2. Graphs of cj>. Panel (a) shows a plot of (f> when a = 0.1, r = 50, fen = 1.7 x 10~^ 
and s — 0.92. Panels (b) and (c) s/io«; profiles in the cross-sections of X2 ~0 and xi = 0, 
respectively, of (f> (solid), (j) when a = 0.05 (dotted), (j) when o = 0.1 (short- dashed) and (jj 
when a = 0.2 (long-dashed). In each case, the estimator (j> has median value of MSSE, out 
of 101 simulations. 




where (fxl^ denotes the complex conjugate of (/> , and a, /3 > are two param- 
eters. The inverse filter is basically the least-squares procedure; use of the 
threshold alleviates noise amplification. The Wiener filter is derived with 
a view to minimizing MSSE of the restored image under the assumption 
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Fig. 3. Cameraman example. Panels (a)-(c) show the original, the blurred, and the 
blurred- and-noisy cameraman images, respectively. Panels (d) and (e) show images re- 
stored from (c) by the inverse filter and the Wiener filter, respectively. Panel (f ) shows 
the restored image, obtained by the Wiener filter, when a Gaussian point-spread function 
with standard deviation A/2 was used in deblurring. 

that noise is Gaussian. These two approaches are used commonly in the 
hterature. See [11], Chapter 5, for detailed discussion. 

The restored image, obtained by inverse filtering from the blurred and 
noisy cameraman image, is shown in Figure 3(d). The corresponding results 
for Wiener filtering are given in Figure 3(e). In each case the tuning pa- 
rameters, a, /3, hn = 10~^ and s = 0.89, were selected to give a good visual 
impression. 

Next, instead of estimating the point-spread function as suggested at 
(2.6), we assumed that the Wiener filter used a Gaussian point-spread func- 
tion with its standard deviation equal to A/2. (This produces virtually the 
best results in the Gaussian case. Note that the radius of the Gaussian 
point-spread function is effectively twice its standard deviation, and that 
the radius of the correct point-spread function equals A.) The corresponding 
result is shown in Figure 3(f). It can be seen that this mistaken guess at the 
point-spread function affects the results considerably. 
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5. Proofs. 



5.1. Proof of Theorem 3.1. Define = ^ s'm{cete/2)\, with q as 

in (3.5); put 



Unl 



Vnl 



An 



r+l 



dt, 



An 



m'{t)\Vpn{t)} 



r+2 



dt\ 



let 7 denote a general, positive function of t = (ti, . . . , t^) that depends on 
the ij's only through their absolute values; and let Un2{'y) and Vn2(7) have 
the same respective definitions as Uni and Vni, but with |^^*(t)| and Pn{t) 
replaced by Pi{t) and pn{t)'y{t), respectively. Noting that the denominator 
contribution to V'n*(*) (3-5) satisfies 

d d d 

1*^1 ^ n l"sin(t^/2n)| < ^ n l^^l' 
e=i 1=1 i=i 

uniformly in t G An, where A G (0, ^) is an absolute constant, and with w 
denoting either w or u, we have 



(5.1) 



Wnl < sup'tt;„2(7)) 

7 



uniformly in (/> G T{A,p), where sup^^ denotes the supremum over choices of 
7 satisfying B^^ < 7 < -Bi- 

Let ait) = {l + \\t\\)-P, P2{t) = Hi finite /2)\, 



(5.2) 



Unsil) 



An 



r+l 



(5.3) Vn3{e,'y) 



m dt 



t\\<ne 



1 



{P2{t)y Pnithit)} 



r+2 



dt. 



where < e < vr. If we change variables in the integrals defining Un2 and Vn2, 
from t = {ti, . . . , tfi)"^ to s = (si, . . . , Sd)'^ , with S£ = citi where q is as in (3.5), 
and if we observe that, in the definition of Un2(7), the method for bounding 
the integral over a rectangle Yli[ncii,nc2e], where — oo < ci^ < < C2£ < oo, 
is the same as that for the integral over An, then it can be deduced from 
(3.2), (5.1), the fact that #T = O(A^), and the definition of J^(A,p) that, for 
each positive sequence En decreasing to zero, and for B\,B2> Q sufficiently 
large. 



(5.4) n'^MSSE < B2 sup'{n"'^A^a>„3(7) + vm{en,l)] + 0(?i"'^A^a^e^^'^), 



d\d 2 ^-~2d\ 
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uniformly in € J-{A,p). 
Define 

Id{e)= f (3{srHP{s)<e}ds, 

wliere e > 0, z > 0, X'rf = [0,1]'^ and I5{s) = ni<f<d|s^l- The result below 
describes the size of /^(e). □ 

Lemma 5.1. AseiQ, hie) = 0{e''+^\\oge\'^-^). 



Proof. Observe that 




< \ \oge\Jd~i{e). 



The latter inequality, and an argument by induction, imply that Jd(e) < 
|loge|'^. This bound and (5.5) establish that /^(e) < /d_i(e) + (2; + l)"^e^+^ x 
I logej"'""'^. It is readily proved that /i(e) = {z + l)~^e^~^^, and so it follows 
inductively that /^(e) = 0(e^^^| logej'^"^) as e ], 0, completing the proof of 
the lemma. □ 

Next we give a bound for Vnzi^nil)^ with t^nsl^^T) defined as at (5.3). If 
j = (ji, . . . , jrf), where each component is an integer, let the d-variate cube Cj 
denote the set of t = (ti, . . . , t^) for which each ti — j^vr G \'^)- Taking 

e = en at (5.3), we may bound the integral there by the sum, fn4(en,7) say, 
over vectors j for which ||j|| < 2ne„, of the integrals 
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In turn, t)n4(£n,7) 7) +Vn6{£:n,^), where Vn5{£n,7) equals the sum 

of Kj over the set ICn of indices j for which each > 1 and \\j\\ < 2ne„. 
Below, we shall establish an order-of- magnitude bound for sup'^ f„5(en, 7), 
uniformly in (p £ T{A.,p). Similar methods may be use to derive the same 
bound for sup'^Dn6(en,7)- 

Define si = liU-jm), C = [-\n, \t^]\ V = Vd = [0, l]'^' and I3{s) = ^ |s^|, 
the latter for s = (si, . . . , s^) € C. Since Pn{t) = hniX\i\ti\)~^\\t\\'^ , then, for 
each j G Kn and t^Cj, P2{t)y Pn{th{t) = ej{t){P{s) V KWjW^}, where Ojit) 
is bounded away from zero and infinity uniformly in such j and t. Therefore, 
defining 6n{u) = hnu'^ for n > 0, and assuming that e„ ^ so slowly that 
nsn 



i- 00, we have 

sup' Vn5{£:n,7) 



(5.6) 



o E(i + 

\jelCn 



-2p 



Pis 



sr+2 



1 2 



/ r2nen r 



ds 



1 



ds 



{/3(s)V<5„(n)}'-+2_ 

uniformly in € J^{A,p). With 5 = 6n{u) we have, uniformly in 1 < n < 2ne„, 



V 



(5.7) 



[l-{f3(s)/6r+'fms)<5}ds 

I{(3{s)<6}ds 
<const.5(l + |log5|)'^-\ 



< 



where the last inequality is a consequence of Lemma 5.1. 
From (5.3) and (5.7) it follows that, provided p > ^{d + q), 

sup'^;„5(en,7) = o|/i„(logn)'^-iy^ (1 + n)^+5-i-2p dnj 

= 0{hn{lognf-'}, 

uniformly in 1;^ G J^{A,p). An identical bound applies to sup'^i)n6(£n,7)> and 
therefore to sup'^ f;n4(en5 7) and so to sup'^ z;„3(e„,7). 



(5.8) 



sup'f„3(e„,7) =0{/i„(logn)'^ ^}. 



A similar argument shows that, with u„3(7) as at (5.2), Cn denoting the 
set of j for which each \ji\ > 1 and ||j|| < nn, and (j) = (ni<£<d Ij^I)^; 
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sup' tins (7) = O 

7 



o 



(i) 



r+2 



/{/5(s)<5n(j)}ds 



O E 0-)'^n(j)-'(log 



n 



6„ 



n 



i + E^ 



-9/3 



dx 



i=i 



where denotes the set of x G M*^ for which each > and < n, and 
the second-last relation follows from Lemma 5.1. Changing variable from Xj 
to Uj = Xj in the last-written integral, we see that the integral is uniformly 
bounded provided that q > 3d. In this case, 

(5.9) sup'^x„3(7) = 0{K\lognf~^}. 

7 

Combining (5.4), (5.8) and (5.9), we deduce that 

(5.10) n'^MSSE = 0{{n~^Xialh~^ + /i„)(logn)'^-i + n^'^X^^ale;;^^}. 

Since here can be taken to equal any sequence that converges to zero 
more slowly than n~^, then the theorem follows from (5.10). 

5.2. Proof of Theorem 3.2. Let o"^ denote the noise variance, let -^(0,1) 
be a random variable having the N(0, 1) distribution and define 

Consider the problem of deciding between ^ = and = 1 on the basis of 
the data Y{j), defined at (1.1), for j € T. This is a classification problem, 
for which the likelihood-ratio rule consists of deciding in favor of ^ = if 

E {Y{j) - {c^om)}' < E - {MU)}\ 

and deciding in favor oi 9 = 1 otherwise. From this property it can be proved 
that 

the probability that the likelihood-ratio rule decides 

(5.11) for 6* = 1 when 6* = 0, 

1/2 

or for ^ = when 0=1, equals TTn = -P(2A''(o,i) > )• 
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Define Xnli'^) — ^ '^^jXe{j /'n)e^^'^^/'^ . Using Parseval's identity and em- 



(5.12) 



(2^)V, 



n 



where h = \€'o- , &) = ^e'it/n) and cP^,\t) = Mj)e''"', 
with (j)g defined at (3.10). Using the Euler-Maclaurin summation formula it 
can be proved that 



(5.13) 



sup \xlUt)-X^\t)\=0{n'-n 

t&An 



Since X0*(O) = 1 and the definition of C2fi is equivalent to C2fiX^e{^) ~ 
then (5.13) implies that C2fi = 1 + 0(n^~^). Therefore, noting that (t^git) = 
C2,eXne(^)i see that (5.13) implies that 



sup \&)-xV{t)\=0{n^-^) 



This result, and the fact that /_^^ iV'n*!^ — imply that 

(5.14) -ll''^\ = 0{n^-P), 



where h = J_a„ IXo* - xfPlV'^^*'^ 



Observe that I2 = \c\ and 1/3^^ — /|^^| = 0(bi), where 



/3 
h 



An 



An 



hif\t/5n) + f' 

Ft ^ + * \ , f¥t 



+ f 



+ f 



Ft 



K\t)fdt 



and 61 = 2(1 - c^^^). For the choice (5„ x (cr^/n'^)^/^^'^) that we shah ulti- 
mately make, 



ci,i = {1 + f\C/Sn)}-' = 1 + Oi6P) = 1 + 0{n 



-pC4/(2d) 



) = l + 0(n 



(l-p)/2 



where we have used the fact that C4 > d{l —p ^). Therefore, |/; 
0(?ii"P), and so by (5.14), 



1/2 .1/2, 



(5.15) |/^/'-lci,i/;/^| = 0(n^-P). 

We shall assume that each q in (3.5) equals 1; the contrary case can be 



rV2| 
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treated by changing variable in each coordinate. Then 



An 



pFt 



Ft 



(5.16) 



< const. 



An 



Ft 



+ / 



5n 



+ 



n 



sin(t£/2) 



\ nsm{ti/{2n)) 



dt 



Ft 



5n 



n 



sin(t£/2) 



dt 



= const. Is, 

say. Take ^ = (27r, . . . , 27r)^. Then can be decomposed into a sum of two 
integrals, of which the first is 



An 



Ft 



n 



sin(t£/2) 



and the second we denote by /y. We shall show how to bound /ei h can be 
treated similarly. 

Let Ani be the set of points in t = (ti, . . . , t^)"'" € w4.„ for which \ti — 
27r| > TT for some £, and put An2 = \ ^ni- The contribution to Ig from 
integrating over equals 0{5'^). To bound the contribution, say Is, to 
Ig from integrating over An2-, note that on the latter set, W^t^ is bounded 
above zero. Therefore, changing variable from t to s where t = 6nS — ^, we 
obtain 



Is < const. 



n sm{te/2] 



dt 



< const. 5!? 



< const. 



f\s)f[sm{SnSe/2) 



1=1 



ds 



n 



ds = 0{6[ 



3d\ 
n ) ) 



the identity holding because p > 3d/2. Therefore, Iq = 0{5^), and an iden- 
tical bound can be derived for /y, implying that Is = 0{6^)^ and hence, by 
(5.16), that h = 0{6^f). Therefore, in view of (5.15), h = 0{6l^ + n'^-'^P). 
Since p > |C2 + 1, then, for the choice (5„ x (cr^/n'^)^/^'^'^\ and noting that 
61'^ > Cin-^^, we have n'^-'^P = 0{5'iP), and thus, h = 0{5ff). Hence, by 
(5.12), 

(5.17) 

Define 



sl = n' 
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Arguments similar to those leading to (5.15) imply that, for a constant 
C>0, 

\sn-{l + o{l)}Ciy''\=0{n^-n- 

From this property and for the choice 5n x ((T^/n'^)^/^^'^) , noting that /g x 5^, 
and also that p > \C2 + 1 [which entails n^~^ = o{5^)], it can be shown that 

(5.18) sl^5i. 

Write Pe and Eg for probability measure and expectation, respectively, 
when the true blur function is (f)g. Let vr^ be as in (5.11), and let > 0. Let 
rjj > denote a positive quantity which depends on -q but always satisfies 
<r]j < 1. Result (5.17) implies that if 

(5.19) n^a-253rf<r?, 

then ^ < 7r„ < |(1 + rji). Hence, by (5.11) and the Neyman-Pearson lemma, 
if 6n is any data-determined rule for deciding between 9 = and = 1, 

(5.20) liminf{Po(^n = 1) + A(^n = 0)} > 1 - 

n — >oo 

For the given the estimator define = if 

E to-0o(j)P< E \Hj)-M3)\', 

and put Bn = l otherwise. Then 

SSE„, = E \4>ij) - Mj)\'' > \l{On ^ 0)n-^sl 

where the inequality follows from the triangle inequality. Therefore, 
sup n'^MSSEne = sup n'^EgiSSEne) 

9=0,1 6»=0,1 

>\sl sup Pe{0n^0) 

9=0,1 

>i4{Po(^n = l) + Pl(^n = 0)}. 

This result and (5.20) imply that there exists Bi> such that 

(5.21) liminf sup n'^MSSE^e > Bi. 

e=o,i 
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If we choose 5„ x {a'^/n'^y-/^^'^\ and such that (5„ < (77cr^/n'^)^/'^^'^\ then 
(5.19) holds and, using (5.18) to get the first inequality, < 828"'^ < 
BsifTn/n'^)'"^^^ ■ It follows from this result and (5.21) that 

liminf(n'^/cj2)i/3 sup n'^MSSE„e > BiB^^, 
which implies (3.11). 
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